Abstract



As is well known, the rotation distance D(S,T) between two binary trees
S, T of n vertices is the minimum number of rotations of pairs of vertices
needed to transform S into T. We introduce the new operation of chain rotation
on a tree, involving two chains of vertices, that requires changing exactly
three pointers in the data structure, as for a standard rotation, and define
the corresponding chain distance C(S,T). As for D(S,T), no polynomial
time algorithm to compute C(S,T) is known. We prove a constructive upper
bound and an analytical lower bound on C(S,T) based on the number of
maximal chains in the two trees. In terms of n we prove the general upper
bound C(S,T) ≤ n - 1 and we show that there are pairs of trees for which
this bound is tight. No similar result is known for D(S,T), where the best
upper and lower bounds are 2n - 6 and 5n/3 - 4 respectively.

Keywords: Binary tree, Rotation distance, Chain distance, Upper and lower
bounds, Design of algorithms.


1 A new definition of tree distance 

Consider a rooted binary tree T of n vertices identified with the integers from 1 to
n in infix order, as for a binary search tree (in the following the term tree always
refers to trees of this form). A subtree of T is a tree rooted in a vertex v of T
and containing all the descendants of v in T. The vertices of a subtree correspond to an
integer interval, e.g., in the tree T of Figure 1 the subtree rooted at 7 corresponds to
the interval [4,8]. A rotation of two adjacent vertices is an operation preserving the
infix order through the change of three pointers. In Figure 1 the rotation between
vertices 5 and 7 produces a tree T' where the right pointers of 3 and 5, and the left
pointer of 7, have been changed. The inverse rotation between 7 and 5 transforms
T' into T.
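As an illustration, a rotation can be sketched in Python on a tree stored as two pointer maps. The encoding (dictionaries `left` and `right`, where a missing key stands for a null pointer), the names `Tree` and `rotate`, and the shape of T, which is reconstructed here from the textual description of Figure 1, are assumptions of this sketch and not part of the original presentation.

```python
class Tree:
    """Binary tree as two pointer maps; a missing key is a null pointer."""

    def __init__(self, root, left, right):
        self.root, self.left, self.right = root, dict(left), dict(right)

    def parent(self, v):
        """Return (parent, side) of v, or (None, None) if v is the root."""
        for side, ptr in (("L", self.left), ("R", self.right)):
            for p, c in ptr.items():
                if c == v:
                    return p, side
        return None, None

    def infix(self):
        """Vertices in infix (symmetric) order, computed iteratively."""
        out, stack, v = [], [], self.root
        while stack or v is not None:
            while v is not None:
                stack.append(v)
                v = self.left.get(v)
            v = stack.pop()
            out.append(v)
            v = self.right.get(v)
        return out


def rotate(t, u):
    """Rotate vertex u with its parent w by changing exactly three pointers."""
    w, side = t.parent(u)
    if w is None:
        raise ValueError("the root has no parent to rotate with")
    x, xside = t.parent(w)
    down, up = (t.left, t.right) if side == "L" else (t.right, t.left)
    moved = up.get(u)                # root of the subtree that is transferred
    if x is None:                    # pointer 1: whatever reached w now reaches u
        t.root = u
    else:
        (t.left if xside == "L" else t.right)[x] = u
    up[u] = w                        # pointer 2: u adopts w
    if moved is None:                # pointer 3: w adopts the transferred subtree
        del down[w]
    else:
        down[w] = moved


# Tree T as reconstructed from the description of Figure 1 (assumed shape).
T = Tree(9, left={9: 3, 3: 2, 2: 1, 7: 5, 5: 4}, right={9: 10, 3: 7, 7: 8, 5: 6})
rotate(T, 5)   # rotation between 5 and 7, giving T'
```

The parent lookup scans the pointer maps, which keeps the sketch short; an implementation that stores parent pointers would perform the rotation in constant time.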

Rotations were originally defined to keep binary search trees balanced. In [2]
Culik and Wood defined the rotation distance D(S,T) between two trees S and
T as the minimum number of rotations needed to transform S into T. D(S,T) has
since been adopted in combinatorics as a standard measure of distance between trees,
and has a role in computational biology, where evolutionary trees are compared
on the basis of subtree transfer [3]. A transformation requiring D(S,T)
rotations is called optimal.

Figure 1: A rooted binary tree T and the effect of a vertex rotation.

A constructive upper bound D(S,T) ≤ 2n - 2 was given in [2] and was improved
to 2n - 6 in the seminal work of Sleator, Tarjan, and Thurston [9], where the authors
transformed the problem into polygon transformation via diagonal flips. Since then
a rich literature has appeared on the subject; nevertheless no efficient algorithm
has been proposed to determine D(S,T), nor is it known whether the problem is
NP-hard. In particular, interesting estimates for D(S,T) have been given in [TllH], a
significant approximation algorithm has been proposed in [3], and other works have
been directed to establishing significant lower bounds |4i)^-. All in all, rotation distance
is a classical topic and has led to many elegant results.

An important concept is that of equivalent edges, that is pairs of edges, one
in S and one in T, whose deletion splits S and T into two parts S_1, S_2 and T_1, T_2
respectively, where S_1, T_1 are subtrees of S, T containing the same subset of vertices
(hence corresponding to the same integer interval), and S_2, T_2 contain the remaining vertices.
In Figure 1 the edges (3,7) of T and (3,5) of T' are equivalent, with the resulting
subtrees of T and T' respectively rooted in 7 and 5 and corresponding to the interval
[4,8]. The remaining portions of T, T' contain vertices 1, 2, 3, 9, 10. If a pair of
equivalent edges exists, any optimal transformation of S into T can be done
independently on S_1, T_1 and S_2, T_2. Note that the equivalent edges can be determined
in linear time. Letting e denote the number of pairs of equivalent edges, the two
trees are split accordingly into e + 1 pairs of trees to be processed independently. It
has been proved that, for e = 0, at least n - 1 rotations are needed to transform S
into T [H [6]; the lower bound D(S,T) ≥ n - e - 1 then follows. We assume that
the splitting of S, T has been done beforehand, so we shall work on trees without
equivalent edges.
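A possible way of detecting equivalent edges is sketched below in Python: the interval of every proper subtree is computed in one traversal of each tree, and the intervals occurring in both trees identify the equivalent pairs. The function names, the encoding of a tree as a triple (root, left pointer map, right pointer map), and the shapes of T and T', reconstructed from the textual description of Figure 1, are assumptions of this sketch.

```python
def subtree_intervals(root, left, right):
    """Map the interval (lo, hi) of each proper subtree to the edge above it."""
    edges = {}

    def visit(v, p):
        # The interval of the subtree rooted at v spans from the leftmost
        # descendant to the rightmost descendant of v.
        lo = visit(left[v], v)[0] if v in left else v
        hi = visit(right[v], v)[1] if v in right else v
        if p is not None:
            edges[(lo, hi)] = (p, v)
        return lo, hi

    visit(root, None)
    return edges


def equivalent_edges(s, t):
    """Pairs of edges of s and t that cut off subtrees on the same interval."""
    a, b = subtree_intervals(*s), subtree_intervals(*t)
    return {iv: (a[iv], b[iv]) for iv in a.keys() & b.keys()}


# T and T' as reconstructed from the description of Figure 1 (assumed shapes).
T = (9, {9: 3, 3: 2, 2: 1, 7: 5, 5: 4}, {9: 10, 3: 7, 7: 8, 5: 6})
T1 = (9, {9: 3, 3: 2, 2: 1, 7: 6, 5: 4}, {9: 10, 3: 5, 7: 8, 5: 7})
pairs = equivalent_edges(T, T1)
e = len(pairs)   # number of pairs of equivalent edges
```

On these two trees the pair stored under the interval (4, 8) consists of the edges (3,7) and (3,5) mentioned in the text. The recursive traversal is adequate for small examples; very long chains would call for an iterative version.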

Rotations were defined in the field of data structures with the unquestionable
merit of being local operations that require exactly three pointer changes. This
implies the replacement of two vertices (5 and 7 in Figure 1) and one subtree transfer
(subtree [6], here composed of one vertex, migrates from the right subtree of 5 to the left
subtree of 7). We now propose a more general operation called c-rotation, where c
stands for chain, that is done on chains instead of single vertices and also requires
three pointer changes and one subtree transfer. A standard rotation is the special case
of a c-rotation in which the chain contains only one vertex. We let:

Terminology and notation. A left chain [u-v], u > v, in a tree T is a sequence
of vertices connected to one another with left pointers, from u (the highest) to v
(the lowest). A maximal left chain is such that no other left chain contains it. The
complete left chain [n-1] contains all the vertices linked with left pointers in the
order n, n-1, ..., 1. A right chain, a maximal right chain, and the complete right
chain [1-n] are similarly defined. A left or right chain containing only one vertex u
is denoted by [u]. L_T and R_T respectively denote the number of maximal left chains
and of maximal right chains in T.

In the tree T of Figure 1: [7-4] is a maximal left chain; [5-4] is a non-maximal
left chain, being contained in [7-4]; [5-6] and [4] are maximal right chains. Here
L_T = 5 and R_T = 6. We have:

Proposition 1 In a tree T of n vertices we have L_T + R_T = n + 1.

Proposition 1 is proved by simple induction on n. The basis is n = 1, for which
we have L_T = 1 and R_T = 1. Assuming the proposition to be true for n - 1, insert
a new leaf f in T as a child of an existing vertex u. If f is a left child of u, L_T is
unchanged and R_T is increased by 1. If f is a right child of u, R_T is unchanged and
L_T is increased by 1. Then the proposition is true for n. Similarly note that L_T
and R_T are respectively equal to the numbers of non-null right and left pointers in
T plus 1. As a consequence the values of L_T, R_T can be computed in O(n) time in
a tree traversal.
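The counting argument can be checked on a small example with the following Python sketch, which lists the maximal chains explicitly and compares their number with the pointer counts. The function name `maximal_chains`, the encoding of the tree as two pointer maps, and the shape of T (reconstructed from the textual description of Figure 1) are assumptions made for illustration.

```python
def maximal_chains(n, down):
    """Maximal chains along the pointer map `down` in a tree on vertices 1..n.

    A maximal chain starts at a vertex that is not itself reached through a
    `down` pointer and follows `down` pointers until a null pointer is met.
    """
    is_child = set(down.values())
    chains = []
    for top in range(1, n + 1):
        if top in is_child:
            continue                 # top lies inside a longer chain
        chain = [top]
        while chain[-1] in down:
            chain.append(down[chain[-1]])
        chains.append(chain)
    return chains


# Tree T as reconstructed from the description of Figure 1 (assumed shape);
# a missing key is a null pointer.
left = {9: 3, 3: 2, 2: 1, 7: 5, 5: 4}
right = {9: 10, 3: 7, 7: 8, 5: 6}

left_chains = maximal_chains(10, left)     # includes [7, 5, 4], i.e. [7-4]
right_chains = maximal_chains(10, right)   # includes [5, 6] and [4]
L_T, R_T = len(left_chains), len(right_chains)
```

In practice only the two pointer counts are needed: `len(right) + 1` gives L_T and `len(left) + 1` gives R_T, as observed at the end of the proof.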

As for standard rotations, a c-rotation can be inverted. If needed we shall
distinguish between direct and inverse c-rotations, defined as follows:

Definition 1 A (direct) c-rotation rot([u-v],w) in a tree T, where [u-v] is a left
chain and u is the right child of w, is a local operation where: (i) u takes the place
of w (i.e. u becomes a child of the parent x of w, if any); (ii) w becomes the left child
of v; and (iii) the left subtree of v, if any (i.e., if [u-v] is not maximal), becomes the
right subtree of w. The definition also holds for a right chain [u-v] exchanging the
terms "left" and "right" whenever they occur.

In the tree T now repeated in Figure 2, the c-rotation rot([7-5],3) produces the
tree T''. Note that a direct c-rotation merges two chains into one, and [u-v] may be
a maximal or a non-maximal chain.
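Definition 1 can be rendered as the following Python sketch. The class `Tree`, its method `redirect`, the function `c_rotate`, and the pointer-map encoding (a missing key is a null pointer) are hypothetical names and choices introduced here; the shape of T is reconstructed from the textual description of Figures 1 and 2 and is likewise an assumption.

```python
class Tree:
    """Binary tree as two pointer maps; a missing key is a null pointer."""

    def __init__(self, root, left, right):
        self.root, self.left, self.right = root, dict(left), dict(right)

    def redirect(self, old, new):
        """Make the pointer reaching `old` (from its parent or from outside) reach `new`."""
        if self.root == old:
            self.root = new
            return
        for ptr in (self.left, self.right):
            for p, c in ptr.items():
                if c == old:
                    ptr[p] = new
                    return


def c_rotate(t, u, v, w):
    """Direct c-rotation rot([u-v], w) following Definition 1."""
    if t.right.get(w) == u:
        chain, other = t.left, t.right     # [u-v] is a left chain, u right child of w
    elif t.left.get(w) == u:
        chain, other = t.right, t.left     # mirrored case for a right chain
    else:
        raise ValueError("u must be a child of w")
    y = u
    while y != v:                          # check that [u-v] really is a chain
        if y not in chain:
            raise ValueError("[u-v] is not a chain")
        y = chain[y]
    below = chain.get(v)                   # subtree to transfer (None if [u-v] is maximal)
    t.redirect(w, u)                       # (i) u takes the place of w
    chain[v] = w                           # (ii) w becomes the child of v along the chain
    if below is None:                      # (iii) w receives the subtree that hung below v
        del other[w]
    else:
        other[w] = below


# Tree T as reconstructed from the description of Figure 1 (assumed shape).
T = Tree(9, left={9: 3, 3: 2, 2: 1, 7: 5, 5: 4}, right={9: 10, 3: 7, 7: 8, 5: 6})
c_rotate(T, 7, 5, 3)   # rot([7-5],3): T becomes T''
```

The call on the last line uses the non-maximal chain [7-5]; calling `c_rotate(T, 7, 4, 3)` on a fresh copy of T would instead use the maximal chain [7-4] and leave the right pointer of 3 null.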
Figure 2: A direct c-rotation on the tree T of Figure 1 merging the chains [7-5] with
[9-1].

Definition 2 An (inverse) c-rotation rot(w,[u-v]) in a tree T, where [u-v] is a left
chain and w is the left child of v (then w is in the same left chain as [u-v]), is a local
operation where: (i) w takes the place of u (i.e. w becomes a child of the parent x
of u, if any) and u becomes the right child of w; and (ii) the right subtree of w, if
any, becomes the left subtree of v. Again the definition also holds for a right chain
[u-v] exchanging the terms "left" and "right" whenever they occur.

In Figure 2 the inverse c-rotation rot(3,[7-5]) in T'' produces the tree T again.
Note that an inverse rotation splits a chain in two, and the chain [u-v] cannot be
maximal. We immediately have:

Proposition 2 In a (direct or inverse) c-rotation three pointers change. Namely,
for rot([u-v],w), where x is the parent of w if any, we change: (i) the pointer from x
to w (or the outside pointer to the tree if w is the tree root); (ii) the left (respectively
right) pointer of v; (iii) the right (respectively left) pointer of w. For rot(w,[u-v]),
where x is the parent of u if any, we
change: (i) the pointer from x to u (or the outside pointer to the tree if u is the tree
root); (ii) the right (respectively left) pointer of w; (iii) the left (respectively right)
pointer of v.

In rot([7-5],3) of Figure 2 the left pointer of the parent x = 9 of w = 3 now points
to u = 7; the left pointer of v = 5 now points to w = 3; the right pointer of w = 3
now points to the left child 4 of v = 5. The effect of the inverse rotation rot(3,[7-5])
is specular. Note that if [u-v] is a maximal chain, v has no left (respectively right)
child, so in a direct rotation rot([u-v],w) the right (respectively left) pointer of w
becomes null.
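The inverse operation and the pointer count of Proposition 2 can be checked with the Python sketch below. The names `Tree`, `redirect`, `inverse_c_rotate`, `pointers` and `changed`, the pointer-map encoding, and the shape of T'' (derived from the textual description of Figure 2) are assumptions of this sketch; for brevity it does not validate that [u-v] is a chain.

```python
class Tree:
    """Binary tree as two pointer maps; a missing key is a null pointer."""

    def __init__(self, root, left, right):
        self.root, self.left, self.right = root, dict(left), dict(right)

    def redirect(self, old, new):
        """Make the pointer reaching `old` (from its parent or from outside) reach `new`."""
        if self.root == old:
            self.root = new
            return
        for ptr in (self.left, self.right):
            for p, c in ptr.items():
                if c == old:
                    ptr[p] = new
                    return


def inverse_c_rotate(t, w, u, v):
    """Inverse c-rotation rot(w, [u-v]) following Definition 2."""
    if t.left.get(v) == w:
        chain, other = t.left, t.right     # [u-v] is a left chain, w left child of v
    elif t.right.get(v) == w:
        chain, other = t.right, t.left     # mirrored case for a right chain
    else:
        raise ValueError("w must be the child of v along the chain")
    moved = other.get(w)                   # subtree of w to be handed over to v
    t.redirect(u, w)                       # pointer 1: w takes the place of u
    other[w] = u                           # pointer 2: u becomes a child of w
    if moved is None:                      # pointer 3: v receives the subtree of w
        del chain[v]
    else:
        chain[v] = moved


def pointers(t):
    """Snapshot of the outside root pointer and of all non-null child pointers."""
    snap = {"root": t.root}
    snap.update({("L", p): c for p, c in t.left.items()})
    snap.update({("R", p): c for p, c in t.right.items()})
    return snap


def changed(before, after):
    """Pointers whose target differs between two snapshots (null counts as a target)."""
    return {k for k in before.keys() | after.keys() if before.get(k) != after.get(k)}


# Tree T'' as derived from the description of Figure 2 (assumed shape).
T2 = Tree(9, left={9: 7, 7: 5, 5: 3, 3: 2, 2: 1}, right={9: 10, 7: 8, 5: 6, 3: 4})
before = pointers(T2)
inverse_c_rotate(T2, 3, 7, 5)              # rot(3,[7-5]): T'' becomes T again
diff = changed(before, pointers(T2))
```

Here `diff` contains the left pointer of 9, the right pointer of 3, and the left pointer of 5, the three pointers named in the example above.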
Definition 3 Given two trees S, T of n vertices, the chain distance C(S,T) is the
minimum number of c-rotations needed to transform S into T.

As a trivial example we have C(T,T'') = 1 in Figure 2. Determining the chain
distance in the general case, however, appears to be hard: no polynomial time
algorithm is known.

2 Upper and lower bounds on C(S,T)

We start by giving a transformation algorithm between two trees S, T based on
c-rotations. The possibility of inverting a c-rotation suggests a strategy often used for
regular rotations, e.g. see [5]. First it is decided how to transform both S and T
into a proper target tree Z. Then the overall procedure will be S → Z → T, where
Z → T is done by inverting the c-rotations of T → Z and applying them in opposite
order. C(S,T) is upper bounded by the sum of the numbers of c-rotations in S → Z and T → Z.
The target tree chosen here is either the complete left chain [n-1] or the complete
right chain [1-n].

In Figure 3 we show the structure of two algorithms for transforming a tree Y of 
n vertices into the chain [n-1] (ROTLEFT), or into the chain [1-n] (ROTRIGHT). 
Clearly both algorithms can be implemented to run in linear time. We have: 

Proposition 3 C(S,T) ≤ min(L_S + L_T - 2, R_S + R_T - 2).

Proposition 3 has a simple constructive proof. The transformation S → T can
be performed as a combination of S → [n-1] and T → [n-1], or as a combination
of S → [1-n] and T → [1-n], using ROTLEFT or ROTRIGHT. Since the numbers
of c-rotations executed by ROTLEFT(Y) and by ROTRIGHT(Y) are L_Y - 1 and
R_Y - 1 respectively, the proposition follows. Since ROTLEFT and ROTRIGHT
use only direct c-rotations, the overall transformation S → Z → T will consist of
direct c-rotations in S → Z and inverse c-rotations in Z → T. We can now derive
an upper bound on C(S,T) as a function of n, namely:

Proposition 4 C(S,T) ≤ n - 1.

Proof. To obtain an upper bound valid for all trees we must compute the maximum
of the function f = min(L_S + L_T - 2, R_S + R_T - 2) given in Proposition 3
under the variation of the parameters involved. Recalling from Proposition 1 that
L_T + R_T = n + 1 for any tree T, and letting L_S + L_T = a, we have
R_S + R_T = 2n + 2 - a, so we can reformulate the
function as f = min(a - 2, 2n - a) with a growing linearly in the range 2 ≤ a ≤ 2n.
Since a - 2 increases and 2n - a decreases with a, f is maximized where the two terms
are equal, that is for a = n + 1, where f = n - 1, and the given bound
follows. Q.E.D.
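The procedure ROTLEFT of Figure 3 can be sketched in Python as follows. The function name `rotleft`, the auxiliary `parent` map, the rule for choosing the next chain (the first right pointer found), and the shape of the test tree (reconstructed from the textual description of Figure 1) are assumptions of this sketch; it makes no attempt at the linear-time bookkeeping mentioned in the text.

```python
def rotleft(root, left, right):
    """Transform a tree into the complete left chain using direct c-rotations.

    The tree is given by its root and two pointer maps (a missing key is a
    null pointer). Returns the new root, the new left map, and the number of
    c-rotations performed.
    """
    left, right = dict(left), dict(right)
    parent = {c: p for ptr in (left, right) for p, c in ptr.items()}
    count = 0
    while right:                           # more than one maximal left chain remains
        w, u = next(iter(right.items()))   # u heads a maximal left chain, w is its parent
        v = u
        while v in left:                   # v is the lowest vertex of the chain [u-v]
            v = left[v]
        x = parent.get(w)
        if x is None:                      # pointer 1: u takes the place of w
            root = u
            parent.pop(u, None)
        else:
            if left.get(x) == w:
                left[x] = u
            else:
                right[x] = u
            parent[u] = x
        left[v] = w                        # pointer 2: w becomes the left child of v
        parent[w] = v
        del right[w]                       # pointer 3: right pointer of w becomes null
        count += 1
    return root, left, count


# Tree T as reconstructed from the description of Figure 1 (assumed shape).
left = {9: 3, 3: 2, 2: 1, 7: 5, 5: 4}
right = {9: 10, 3: 7, 7: 8, 5: 6}
new_root, chain, steps = rotleft(9, left, right)
```

Every chain [u-v] chosen by the loop is maximal, so each c-rotation removes one right pointer and the loop performs L_Y - 1 iterations, in agreement with the count used in the proof of Proposition 3.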

For proving a lower bound on C(S,T) we must rely on the properties of
c-rotations. Working on the maximal chains of S, T we have:
algorithm ROTLEFT(Y):

while (there is more than one left chain in Y)

merge a maximal left chain [u-v] with the chain containing the parent w of u
by applying rot([u-v],w).

algorithm ROTRIGHT(Y):

while (there is more than one right chain in Y)

merge a maximal right chain [u-v] with the chain containing the parent w of u
by applying rot([u-v],w).

Figure 3: The algorithms ROTLEFT and ROTRIGHT for transforming a tree Y
into a complete chain.



Proposition 5 C(S,T) ≥ |L_S - L_T|.

Proof. In a tree Y a direct c-rotation rot([u-v],w), where [u-v] is a maximal left
(respectively right) chain, decreases L_Y by 1 and increases R_Y by 1
(respectively increases L_Y by 1 and decreases R_Y by 1). If instead [u-v] is not maximal the
values of L_Y and R_Y remain unchanged (for example see the c-rotation in Figure
2). An inverse rotation rot(w,[u-v]), where [u-v] is a left (respectively right) chain
and w has no right (respectively left) child, increases L_Y by 1 and decreases
R_Y by 1 (respectively decreases L_Y by 1 and increases R_Y by 1). If instead w has a right
(respectively left) child the values of L_Y and R_Y remain unchanged (for example
invert the rotation in Figure 2). Hence a c-rotation may change the value of L_Y by at
most one. The given bound immediately follows because, after the transformation of
S into T, the two trees must have the same number of maximal left chains. Q.E.D.

Note that from Proposition 1 we have L_S = n + 1 - R_S and L_T = n + 1 - R_T,
hence L_S - L_T = R_T - R_S. This implies that |L_S - L_T| can be replaced with
|R_S - R_T| in Proposition 5. There are pairs of trees without equivalent edges for
which the lower bound of Proposition 5 is particularly significant. For the trees of
Figure 4, for example, we have L_S = c and L_T = n, hence C(S,T) ≥ n - c where
c is an arbitrary integer constant, 1 ≤ c ≤ n - 1. In fact there are many ways of
building pairs of trees with |L_S - L_T| = n - c for arbitrary values of c. In particular
for c = 1 we have:

Proposition 6 There are pairs of trees S, T without equivalent edges for which
C(S,T) ≥ n - 1.

The upper and lower bounds of Propositions 4 and 6 are tight. A comparable
result is unknown for the standard rotation distance D(S,T), where the upper bound
2n - 6 must be compared with the highest known lower bounds 5n/3 - 4 or 2n - O(√n)
proved in [4]. It is also worth noting that for the trees of Figure 4 we have C(S,T) ≤
R_S + R_T - 2 = n - c by Proposition 3, matching the lower bound shown in the figure
for any value of c.

Figure 4: Two trees meeting the lower bound C(S,T) ≥ n - c by Proposition 5.
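The two bounds can be evaluated directly from the pointer counts, as in the Python sketch below. The function name `chain_bounds` and the encoding of a tree as a pair (left pointer map, right pointer map) are assumptions; the first example pairs the complete left chain with the complete right chain, a choice made here to realize the case c = 1 and not necessarily the pair drawn in Figure 4, while the second uses T and T'' as reconstructed from the description of Figure 2.

```python
def chain_bounds(s, t):
    """Return (lower, upper) bounds on C(S,T) from Propositions 5 and 3.

    Each tree is a pair (left, right) of pointer maps in which a missing key
    is a null pointer, so that L = len(right) + 1 and R = len(left) + 1.
    """
    (left_s, right_s), (left_t, right_t) = s, t
    L_S, R_S = len(right_s) + 1, len(left_s) + 1
    L_T, R_T = len(right_t) + 1, len(left_t) + 1
    lower = abs(L_S - L_T)
    upper = min(L_S + L_T - 2, R_S + R_T - 2)
    return lower, upper


n = 6
left_chain = ({k: k - 1 for k in range(2, n + 1)}, {})    # complete left chain [n-1]
right_chain = ({}, {k: k + 1 for k in range(1, n)})       # complete right chain [1-n]
tight = chain_bounds(left_chain, right_chain)             # both bounds equal n - 1

# T and T'' as reconstructed from the description of Figure 2 (assumed shapes).
T = ({9: 3, 3: 2, 2: 1, 7: 5, 5: 4}, {9: 10, 3: 7, 7: 8, 5: 6})
T2 = ({9: 7, 7: 5, 5: 3, 3: 2, 2: 1}, {9: 10, 7: 8, 5: 6, 3: 4})
loose = chain_bounds(T, T2)
```

For the second pair the bounds are far apart even though C(T,T'') = 1, which shows that the chain counts alone do not determine the chain distance.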

3 Concluding remarks 

Standard rotations have been mainly studied in the framework of data organization
and computational biology. When considered as a measure of tree distance, however,
different rules may apply to different cases. As an example, finding pairs of trees
that meet the upper bound of 2n - 6 is outside the realm of balanced search trees,
where long chains of pointers prevent such a bound from being met [5]. Therefore it
seems reasonable to investigate other concepts of distance going beyond standard
rotations. In this respect we have proposed chain distance here. In fact we believe
that a possible merit of this note, if any, is stimulating a discussion on the concept
of distance between trees.

When looking for alternatives we must observe three main properties, fulfilled
both by rotations and chain rotations. First, the transformation relies on subtree
transfer and on the replacement of a constant number of vertices; second, an invariant
is maintained in the tree (the infix ordering of vertices in our case); and finally
the basic operation requires constant time (a rotation or a c-rotation requires changing
three pointers). An alternative to rotations and c-rotations could be moving a single
vertex v along a chain with a jump of any length to insert v above one of its ancestors
w, thus defining a long distance L(S,T) between two trees. It can be easily seen
that also in this case one subtree is relocated, the infix ordering of the vertices is
maintained, and only three pointers are changed. Again D(S,T) is a special case of
L(S,T) if only vertex jumps of length one are allowed. Upper and lower bounds on
the distance should be established in this case.

In fact, a wealth of possibilities is open.